SentiCap: Generating Image Descriptions with Sentiments
Authors
Alexander Patrick Mathews, Lexing Xie, Xuming He
Abstract
The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is descriptions with emotions, which are commonplace in everyday communication and influence decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with a range of automatic and crowd-sourced metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases, the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by crowd-sourced workers as having the appropriate sentiment.
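The following is a minimal, illustrative sketch (in PyTorch) of the switching-RNN idea the abstract describes: two parallel LSTM streams, one factual and one sentiment-bearing, whose word distributions are blended by a learned per-word switch probability, with a word-level regularization term that nudges the switch toward words annotated as sentiment-bearing. All module names, layer sizes, and the exact mixing and regularization forms are assumptions made for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of a switching caption RNN: a "factual" and a "sentiment"
# LSTM stream are mixed per word by a learned switch probability gamma.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwitchingCaptionRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512, image_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.img_proj = nn.Linear(image_dim, hidden_dim)    # image feature -> initial state
        self.factual_lstm = nn.LSTMCell(embed_dim, hidden_dim)
        self.sentiment_lstm = nn.LSTMCell(embed_dim, hidden_dim)
        self.factual_out = nn.Linear(hidden_dim, vocab_size)
        self.sentiment_out = nn.Linear(hidden_dim, vocab_size)
        self.switch = nn.Linear(2 * hidden_dim, 1)           # per-word switch probability

    def forward(self, image_feats, captions):
        """image_feats: (B, image_dim); captions: (B, T) word indices."""
        B, T = captions.shape
        h0 = torch.tanh(self.img_proj(image_feats))
        c0 = torch.zeros_like(h0)
        hf, cf = h0, c0                       # factual stream state
        hs, cs = h0.clone(), c0.clone()       # sentiment stream state
        logps, switches = [], []
        for t in range(T):
            x = self.embed(captions[:, t])
            hf, cf = self.factual_lstm(x, (hf, cf))
            hs, cs = self.sentiment_lstm(x, (hs, cs))
            gamma = torch.sigmoid(self.switch(torch.cat([hf, hs], dim=1)))   # (B, 1)
            # Mixture of the two word distributions, weighted by the switch.
            p = (1 - gamma) * F.softmax(self.factual_out(hf), dim=1) \
                + gamma * F.softmax(self.sentiment_out(hs), dim=1)
            logps.append(torch.log(p + 1e-8))
            switches.append(gamma)
        return torch.stack(logps, dim=1), torch.stack(switches, dim=1)


def loss_with_word_regularization(logps, switches, targets, sentiment_mask, reg=1.0):
    """Cross-entropy plus a word-level term that encourages the switch to fire
    on words marked as sentiment-bearing (sentiment_mask in {0, 1})."""
    nll = F.nll_loss(logps.transpose(1, 2), targets)
    word_reg = F.binary_cross_entropy(switches.squeeze(-1), sentiment_mask.float())
    return nll + reg * word_reg
```

One plausible training regime, consistent with the small sentiment corpus mentioned in the abstract, is to pre-train the factual stream on a large caption dataset and fine-tune only the sentiment stream and the switch on the 2000+ sentiment-labelled sentences; this division of training is an assumption for the sketch, not a claim about the paper's exact procedure.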
Related papers
Fusing Social Networks with Deep Learning for Volunteerism Tendency Prediction
Zobrist Hashing: An Efficient Work Distribution Method for Parallel Best-First Search (Yuu Jinnai, Alex Fukunaga). VIS: Text and Vision Oral Presentations: 1326 SentiCap: Generating Image Descriptions with Sentiments (Alexander Patrick Mathews, Lexing Xie, Xuming He); 1950 Reading Scene Text in Deep Convolutional Sequences (Pan He, Weilin Huang, Yu Qiao, Chen Change Loy, Xiaoou Tang); 1247 Creating Image...
Midge: Generating Image Descriptions From Computer Vision Detections
This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation syst...
From Image Annotation to Image Description
In this paper, we address the problem of automatically generating a description of an image from its annotation. Previous approaches either use computer vision techniques to first determine the labels or exploit available descriptions of the training images to either transfer or compose a new description for the test image. However, none of them report results on the effect of incorrect label d...
Generating Image Descriptions with Gold Standard Visual Inputs: Motivation, Evaluation and Baselines
In this paper, we present the task of generating image descriptions with gold standard visual detections as input, rather than directly from an image. This allows the Natural Language Generation community to focus on the text generation process, rather than dealing with the noise and complications arising from the visual detection process. We propose a fine-grained evaluation metric specificall...
Generating Natural Video Descriptions via Multimodal Processing
Generating natural language descriptions of visual content is an intriguing task which has wide applications such as assisting blind people. The recent advances in image captioning stimulate further study of this task in more depth including generating natural descriptions for videos. Most works of video description generation focus on visual information in the video. However, audio provides ri...
Journal:
Volume / Issue:
Pages: -
Publication date: 2016